Economic Freedom Index

The Index of Economic Freedom is a ranking created to measure economic freedom in the countries of the world. Now in its 25th edition, the Index is poised to help readers track over two decades of advancement in economic freedom, prosperity, and opportunity, and to promote these ideas in their homes, schools, and communities. The Index covers 12 freedoms, from property rights to financial freedom, in 186 countries.

Import Libraries

Load the file

Let's print the head of the dataframe

Let's print the tail of the dataframe

Take a backup of the DataFrame

Let's print the shape of the dataframe

Let's describe the dataframe to see the mean, standard deviation, etc.

Let's see the column types

Fix the spaces and other characters in column names
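One way to do this is sketched below. The raw headers in the toy frame are assumptions made for illustration (the real file's headers may differ), and `clean_column_names` is a hypothetical helper, not a pandas function. The sketch drops a trailing "(%)" unit suffix and replaces spaces with underscores, which matches names such as Tariff_Rate and Gov't_Spending used later on.

```python
import pandas as pd

# Toy frame with assumed raw headers; the real file's headers may differ.
raw = pd.DataFrame(
    [[1, 5.0, 30.0, 64.2]],
    columns=["World Rank", "Tariff Rate (%)", "Income Tax Rate (%)", "Gov't Spending"],
)


def clean_column_names(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with '(%)' suffixes removed and spaces turned into underscores."""
    cleaned = (
        frame.columns.str.replace(r"\s*\(%\)", "", regex=True)  # drop the unit suffix
        .str.strip()                                             # trim stray whitespace
        .str.replace(r"\s+", "_", regex=True)                    # spaces -> underscores
    )
    out = frame.copy()
    out.columns = cleaned
    return out


clean = clean_column_names(raw)
print(list(clean.columns))
```

Returning a copy keeps the raw frame untouched, so the original headers stay available for reference.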

Drop columns that are either duplicates or do not add any information

Let's see the head of the dataframe after column alterations

Let's see the tail of the dataframe after column alterations

Data Preprocessing

Check for null values and replace them with median if feasible

As shown above, there are a lot of null values, and they need to be imputed and/or replaced with the right values

Let's remove the rows that have null values, as they are few in number
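The two options mentioned in this section (median imputation and dropping rows) can be sketched as follows. The frame and its values are made up for illustration; only the pandas calls are the point.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; values are invented and do not come from the dataset.
df = pd.DataFrame(
    {
        "Country": ["A", "B", "C", "D", "E"],
        "Tax_Burden": [80.0, np.nan, 75.0, 90.0, 85.0],
        "Inflation": [2.0, 3.5, np.nan, 1.5, 4.0],
    }
)

# Count missing values per column before deciding how to handle them.
null_counts = df.isnull().sum()

# Option 1: fill missing numeric values with the column median.
numeric_cols = df.select_dtypes(include="number").columns
imputed = df.copy()
imputed[numeric_cols] = imputed[numeric_cols].fillna(imputed[numeric_cols].median())

# Option 2: drop every row that contains at least one null value.
dropped = df.dropna().reset_index(drop=True)

print(null_counts)
print(len(df), "rows before,", len(dropped), "rows after dropping")
```

Dropping is the simpler choice when only a handful of rows are affected; imputation keeps every country in the analysis at the cost of introducing estimated values.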

Let's convert the columns with an 'object' datatype into float variables
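A minimal sketch of this conversion, assuming the text columns contain currency symbols and thousands separators. The column names and values below are invented for illustration, and `to_float` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical text columns; names and values are invented for illustration.
df = pd.DataFrame(
    {
        "GDP_per_Capita": ["$1,958", "$12,507", "$15,237"],
        "Unemployment": ["8.8", "13.9", "10.0"],
    }
)


def to_float(series: pd.Series) -> pd.Series:
    """Strip '$' and ',' characters, then parse the remainder as float."""
    stripped = series.astype(str).str.replace(r"[$,]", "", regex=True).str.strip()
    # errors="coerce" turns unparseable entries into NaN instead of raising.
    return pd.to_numeric(stripped, errors="coerce").astype(float)


for col in ["GDP_per_Capita", "Unemployment"]:
    df[col] = to_float(df[col])

print(df.dtypes)
```

Because unparseable entries become NaN, it is worth re-checking the null counts after the conversion.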

Let's convert the categorical columns into Integer variables as the values are numeric
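As a sketch, assuming the columns in question are the rank columns (an assumption; the notebook does not list them here), the cast could look like this. The guard checks that every value is whole so the cast does not silently truncate anything.

```python
import pandas as pd

# Hypothetical frame; treating the rank columns as the ones to cast is an assumption.
df = pd.DataFrame({"World_Rank": [3.0, 1.0, 2.0], "Region_Rank": [2.0, 1.0, 1.0]})
rank_cols = ["World_Rank", "Region_Rank"]

# Only cast when every value is a whole number, so no information is lost.
all_whole = bool((df[rank_cols] % 1 == 0).all().all())
if all_whole:
    df = df.astype({col: int for col in rank_cols})

print(df.dtypes)
```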

Exploratory Data Analysis - Univariate

Observations on World_Rank

As shown above, the World_Rank has no outliers
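Outlier checks like the ones in this section are usually read off a box plot. A numeric counterpart is sketched below using the 1.5 x IQR rule, which is the default whisker rule in common plotting libraries. `iqr_outliers` is a hypothetical helper and both series contain invented values.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]


# Invented data: ranks are evenly spread, the scores contain one extreme value.
ranks = pd.Series(range(1, 11), name="World_Rank")
scores = pd.Series([40, 42, 45, 47, 50, 52, 55, 95], name="score")

print(iqr_outliers(ranks).tolist())
print(iqr_outliers(scores).tolist())
```

A rank column is evenly spaced by construction, so it is expected to show no outliers under this rule.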

Observations on Region_Rank

As shown above, Region_Rank also doesn't have any outliers

Observations on Judical_Effectiveness

As shown above, Judicial_Effectiveness has some outliers, but let's ignore them considering the size of the dataset

Observations on Government_Integrity

As shown above, Government_Integrity has some outliers which might need to be treated - let's hold on to it for now

Observations on Tax_Burden

Observations on Investment_Freedom

Observations on Financial_Freedom

Observations on Tariff_Rate

Observations on Income_Tax_Rate

Observations on Inflation

Exploratory Data Analysis - Bivariate

As shown above,

Investment_Freedom vs Financial_Freedom and Region

On average, Investment_Freedom and Financial_Freedom are better in the Europe region than in the Americas

Region vs Judical_Effectiveness and Government_Integrity

On average, Judicial_Effectiveness and Government_Integrity are better in the Europe region than in the other regions. There are some outliers among the African countries

Region vs Gov't_Spending and Tax_Burden

As shown above

Region vs Income_Tax_Rate and Corporate_Tax_Rate

As shown above, Income_Tax_Rate is higher in Sub-Saharan Africa, and Corporate_Tax_Rate is similarly high in the same region

CDF Plot of Numerical Variables

As shown above, the number of variables is large, so this bivariate analysis might not reveal much beyond the fact that there are patterns in the correlation between them (as seen in the grouping of the dots)

Scaling
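Distance-based clustering is sensitive to the units of each column, so the features are put on a common scale first. The sketch below assumes z-score standardization with scikit-learn's `StandardScaler`; the notebook may have used a different scaler, and the values are invented.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Invented numeric features on very different scales.
df = pd.DataFrame(
    {
        "Tax_Burden": [60.0, 70.0, 80.0, 90.0],
        "Inflation": [1.0, 2.0, 3.0, 10.0],
    }
)

# Standardize each column to zero mean and unit (population) variance.
scaler = StandardScaler()
scaled = pd.DataFrame(scaler.fit_transform(df), columns=df.columns, index=df.index)

print(scaled.round(3))
```

Wrapping the result back into a DataFrame keeps the column names, which makes the later cluster profiles easier to read.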

K-Means Clustering

The appropriate value for K from the elbow curve seems to be 2 or 3
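An elbow curve can be built by fitting K-Means for a range of k and recording how tightly the points sit around their centroids. The sketch below uses synthetic two-blob data in place of the scaled dataset and uses `inertia_` (within-cluster sum of squares) as the measure; the notebook may have used a different distortion measure.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the scaled data: two well-separated blobs.
X, _ = make_blobs(
    n_samples=150, centers=[[-5, -5], [5, 5]], cluster_std=1.0, random_state=1
)

# Fit K-Means for k = 1..6 and record the within-cluster sum of squares.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=1).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))
```

The "elbow" is the k after which the drop in inertia flattens out; plotting `inertias` against k makes it easier to spot.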

Let's check the Silhouette scores

From the silhouette scores, it seems that 2 is a good value for k.
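The silhouette check can be sketched in the same way. Synthetic two-blob data again stands in for the scaled dataset, so the numbers printed here say nothing about the real data.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the scaled data: two well-separated blobs.
X, _ = make_blobs(
    n_samples=150, centers=[[-5, -5], [5, 5]], cluster_std=1.0, random_state=1
)

# The silhouette score is only defined for at least two clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print({k: round(s, 3) for k, s in scores.items()}, "best k:", best_k)
```

Scores closer to 1 indicate compact, well-separated clusters, so the k with the highest score is the natural candidate.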

Observations

Hierarchical Clustering

Let's explore different linkage methods with Euclidean distance only

We see that the cophenetic correlation is maximum with Euclidean distance and average linkage
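A comparison like this can be sketched with SciPy: compute the linkage for each method and correlate the resulting cophenetic distances with the original pairwise distances. The data is synthetic and the list of methods is an assumption about what was compared.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist

# Synthetic stand-in for the scaled data: two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-5, 1, (30, 2)), rng.normal(5, 1, (30, 2))])

# Pairwise Euclidean distances, reused for every linkage method.
distances = pdist(X, metric="euclidean")

coph = {}
for method in ["single", "complete", "average", "ward"]:
    Z = linkage(X, method=method, metric="euclidean")
    c, _ = cophenet(Z, distances)  # c is the cophenetic correlation coefficient
    coph[method] = float(c)

best_method = max(coph, key=coph.get)
print({m: round(c, 3) for m, c in coph.items()}, "best:", best_method)
```

A coefficient close to 1 means the dendrogram preserves the original pairwise distances well.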

Let's view the dendrograms for the different linkage methods

Let's move ahead with 2 clusters, Euclidean distance, and average linkage
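A sketch of this step using SciPy, again on synthetic data. The notebook may have used scikit-learn's `AgglomerativeClustering` instead; cutting the average-linkage tree with `fcluster` is shown here as an equivalent way to obtain two flat clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic stand-in for the scaled data: two Gaussian blobs of 30 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-5, 1, (30, 2)), rng.normal(5, 1, (30, 2))])

# Build the tree with average linkage on Euclidean distances.
Z = linkage(X, method="average", metric="euclidean")

# Cut the tree into exactly 2 flat clusters; fcluster labels start at 1,
# so subtract 1 to get labels 0 and 1.
hc_labels = fcluster(Z, t=2, criterion="maxclust") - 1

print(np.bincount(hc_labels))
```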

Cluster Profiling and Comparison

Cluster Profiling: K-means Clustering
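Profiling usually means attaching the cluster labels to the original (unscaled) data and summarizing each feature per cluster. In the sketch below the label column `KM_cluster` and all values are hypothetical; the same pattern applies to the hierarchical labels.

```python
import pandas as pd

# Invented data with a hypothetical K-Means label column.
df = pd.DataFrame(
    {
        "Tax_Burden": [70.0, 80.0, 90.0, 60.0, 60.0],
        "Financial_Freedom": [30.0, 50.0, 70.0, 80.0, 60.0],
        "KM_cluster": [0, 0, 1, 1, 1],
    }
)

# Mean of every feature per cluster, plus the number of rows in each cluster.
profile = df.groupby("KM_cluster").mean()
profile["count"] = df.groupby("KM_cluster").size()

print(profile)
```

Reading the profile row by row shows which features separate the clusters and how balanced the cluster sizes are.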

Cluster Profiling: Hierarchical Clustering

K-Means Clustering vs Hierarchical Clustering Comparison
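One way to compare two clusterings is a cross-tabulation of their labels together with the adjusted Rand index, which ignores how the clusters happen to be numbered. This is a sketch on invented labels, not necessarily the comparison used in the notebook.

```python
import pandas as pd
from sklearn.metrics import adjusted_rand_score

# Invented label vectors for the same eight rows.
km_labels = [0, 0, 0, 1, 1, 1, 1, 0]
hc_labels = [1, 1, 1, 0, 0, 0, 1, 1]

# Rows: K-Means cluster, columns: hierarchical cluster, cells: row counts.
table = pd.crosstab(
    pd.Series(km_labels, name="KMeans"), pd.Series(hc_labels, name="HC")
)

# 1.0 means identical partitions; values near 0 mean chance-level agreement.
ari = adjusted_rand_score(km_labels, hc_labels)

print(table)
print(round(ari, 3))
```

Because cluster numbers are arbitrary, a table with large off-diagonal counts can still indicate strong agreement, which is why the adjusted Rand index is a useful companion to the raw table.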

Let's create some plots on the original data to understand the distribution among the clusters

Cluster Comparison

Cluster 0

Cluster 1

Cluster Overlapping

PCA for Visualization
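To plot clusters from many features on a 2D chart, the scaled data can be projected onto its first two principal components. The sketch below uses random data as a stand-in, so the explained variance it prints says nothing about the real dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Random stand-in for the feature matrix: 50 rows, 6 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))

# PCA is scale-sensitive, so standardize before projecting.
X_scaled = StandardScaler().fit_transform(X)

# Keep only the first two components for a 2D scatter plot.
pca = PCA(n_components=2, random_state=0)
components = pca.fit_transform(X_scaled)
explained = float(pca.explained_variance_ratio_.sum())

print(components.shape, round(explained, 3))
```

The two columns of `components` can then be used as x and y coordinates in a scatter plot colored by cluster label. If `explained` is low, overlap in the plot may reflect information lost in the projection as well as genuinely overlapping clusters.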

HC clusters don't seem to be well separated